Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis

نویسندگان

  • Shen Huang
  • Houfeng Wang
چکیده

Grammatical Error Diagnosis for Chinese has always been a challenge for both foreign learners and NLP researchers, for the variousity of grammar and the flexibility of expression. In this paper, we present a model based on Bidirectional Long Short-Term Memory(Bi-LSTM) neural networks, which treats the task as a sequence labeling problem, so as to detect Chinese grammatical errors, to identify the error types and to locate the error positions. In the corpora of this year’s shared task, there can be multiple errors in a single offset of a sentence, to address which, we simutaneously train three Bi-LSTM models sharing word embeddings which label Missing, Redundant and Selection errors respectively. We regard word ordering error as a special kind of word selection error which is longer during training phase, and then separate them by length during testing phase. In NLP-TEA 3 shared task for Chinese Grammatical Error Diagnosis(CGED), Our system achieved relatively high F1 for all the three levels in the traditional Chinese track and for the detection level in the Simpified Chinese track.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Chinese Grammatical Error Diagnosis Using Single Word Embedding

Automatic grammatical error detection for Chinese has been a big challenge for NLP researchers. Due to the formal and strict grammar rules in Chinese, it is hard for foreign students to master Chinese. A computer-assisted learning tool which can automatically detect and correct Chinese grammatical errors is necessary for those foreign students. Some of the previous works have sought to identify...

متن کامل

YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model

Building a system to detect Chinese grammatical errors is a challenge for naturallanguage processing researchers. As Chinese learners are increasing, developing such a system can help them study Chinese more easily. This paper introduces a bidirectional long short-term memory (BiLSTM) conditional random field (CRF) model to produce the sequences that indicate an error type for every position of...

متن کامل

CVTE at IJCNLP-2017 Task 1: Character Checking System for Chinese Grammatical Error Diagnosis Task

Grammatical error diagnosis is an important task in natural language processing. This paper introduces CVTE Character Checking System in the NLP-TEA-4 shared task for CGED 2017, we use Bi-LSTM to generate the probability of every character, then take two kinds of strategies to decide whether a character is correct or not. This system is probably more suitable to deal with the error type of bad ...

متن کامل

Neural Networks for Negation Cue Detection in Chinese

Negation cue detection involves identifying the span inherently expressing negation in a negative sentence. In Chinese, negative cue detection is complicated by morphological proprieties of the language. Previous work has shown that negative cue detection in Chinese can benefit from specific lexical and morphemic features, as well as cross-lingual information. We show here that they are not nec...

متن کامل

Chinese Grammatical Error Diagnosis with Long Short-Term Memory Networks

Grammatical error diagnosis is an important task in natural language processing. This paper introduces our Chinese Grammatical Error Diagnosis (CGED) system in the NLP-TEA-3 shared task for CGED. The CGED system can diagnose four types of grammatical errors which are redundant words (R), missing words (M), bad word selection (S) and disordered words (W). We treat the CGED task as a sequence lab...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016